AViNet: Diving Deep into Audio-Visual Saliency Prediction
Top-ranked on Papers With Code
https://arxiv.org/pdf/2012.06170v1.pdf